Consistent feature selection for analytic deep neural networks
One of the most important steps toward interpretability and explainability of neural network models is feature selection, which aims to identify the subset of relevant features. Theoretical results in the field have mostly focused on the prediction aspect of the problem with virtually no work on feature selection consistency for deep neural networks due to the model's severe nonlinearity and unidentifiability. This lack of theoretical foundation casts doubt on the applicability of deep learning to contexts where correct interpretations of the features play a central role. In this work, we investigate the problem of feature selection for analytic deep networks. We prove that for a wide class of networks, including deep feed-forward neural networks, convolutional neural networks and a major sub-class of residual neural networks, the Adaptive Group Lasso selection procedure with Group Lasso as the base estimator is selection-consistent. The work provides further evidence that Group Lasso might be inefficient for feature selection with neural networks and advocates the use of Adaptive Group Lasso over the popular Group Lasso.
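The two-stage procedure in the abstract, Group Lasso as the base estimator followed by Adaptive Group Lasso, acts on the first-layer weights of the network: each input feature corresponds to a group (the column of first-layer weights attached to that input), and the adaptive stage reweights each group's penalty by the inverse norm of the base estimate. Below is a minimal NumPy sketch of these penalties and the resulting selection rule; the function names, the `gamma` exponent, and the thresholding tolerance are illustrative choices, not the paper's exact formulation.

```python
import numpy as np

def group_lasso_penalty(W1, lam):
    # W1: first-layer weight matrix, shape (hidden_units, n_features).
    # Feature j's group is the column W1[:, j]; Group Lasso penalizes
    # the sum of the Euclidean norms of the columns.
    return lam * np.linalg.norm(W1, axis=0).sum()

def adaptive_group_lasso_penalty(W1, W1_base, lam, gamma=1.0, eps=1e-12):
    # Adaptive Group Lasso: each group's penalty is reweighted by the
    # inverse norm of the base (Group Lasso) estimate W1_base, so groups
    # the base estimator shrank toward zero are penalized more strongly
    # and can be driven exactly to zero.
    base_norms = np.linalg.norm(W1_base, axis=0)
    weights = 1.0 / (base_norms + eps) ** gamma
    return lam * (weights * np.linalg.norm(W1, axis=0)).sum()

def selected_features(W1, tol=1e-6):
    # A feature is selected iff its first-layer column is numerically nonzero.
    return np.where(np.linalg.norm(W1, axis=0) > tol)[0]
```

For example, if the fitted first layer is `W1 = np.array([[0., 1., 0.], [0., -2., 0.]])`, then `selected_features(W1)` returns only feature index 1, since the other two columns have zero norm.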
Review for NeurIPS paper: Consistent feature selection for analytic deep neural networks
Additional Feedback: Below please find my detailed comments: 1. This paper does not consider feature selection in the high-dimensional setting. It considers an easier problem setting in which the total number of features is fixed and does not scale with the number of data points. It is important to study feature selection methods in the high-dimensional setting, where the total number of features may grow exponentially fast with respect to the sample size. However, various previous works actually already give selection consistency results, for example [1, 2, 3], and the selection consistency in [4, 5] can easily be obtained with a few simple arguments built upon their analysis.
Review for NeurIPS paper: Consistent feature selection for analytic deep neural networks
This paper shows that the adaptive group lasso feature selection method, more specifically a combined strategy called GL AGL, is selection-consistent for a very general class of deep neural networks, provided that the DNN interacts with the input through a finite set of linear units. This is an important property, since it provides a guarantee that, with enough training examples, GL AGL will effectively identify the set of relevant inputs, making the DNN more interpretable. The general structure of the proof follows the analysis of high-dimensional linear models, but new technical elements are introduced to tackle the difficulties that arise when the linear transformation of the first layer is followed by a sequence of the non-linear transformations typically used in DNNs. Finally, the numerical experiments provide evidence that the popular group lasso method might be an inefficient feature selection method for DNNs.